ABSTRACT


This paper introduces a new approach to coding ultrasound video, the intended application being very low bit rate
coding for transmission over low cost phone lines. The method exploits both the characteristic noise and the quasi-periodic
nature of the signal. Data compression ratios between 250:1 and 1000:1 are shown to be possible, which is sufficient for
transmission over ISDN and conventional phone lines. Preliminary results show this approach to be promising for remote
ultrasound examinations.
1 INTRODUCTION


Ultrasound video is a very cost effective diagnostic modality, and thus is widely used throughout this country and
the world. Although ultrasound equipment is often available in rural and remote corners of the country, specialists to
interpret data are typically in short supply in these locations. With the interest and support in telemedicine, the notion of
having specialists perform ultrasound examinations at remote locations via electronic data exchange is very attractive. In
the absence of channel bandwidth constraints, such an approach is straightforward, with high potential benefits related to
providing immediate care and lowering overall expense. Unfortunately, many of these remote locations do not have access
to or cannot afford to use high capacity channels (such as T1 lines) to interface with large well-staffed urban medical centers
where such specialists reside.

In the presence of channel bandwidth constraints, this approach is encumbered by the large volume of data associated
with digital video. Effective compression of the ultrasound prior to transmission will allow this data transfer to occur. The
key is to achieve sufficient compression with acceptable reconstruction quality at rates compatible with telephone and ISDN
lines. In this work we consider ultrasound video of the heart, where the remote examination involves a specialist at a remote
location guiding the attending practitioner by telephone. A critical part of this examination is obtaining proper positioning
of the ultrasound probe, so that a diagnosis can be made. The transmitted video quality standards for positioning purposes
are clearly not as stringent as those for diagnosis. If positioning quality can be achieved, then higher quality video can be
transmitted in a non-real-time mode for diagnosis. Of course we hope to eventually be able to transmit diagnostic quality
video in real time, but this is not yet in reach. Regardless, the approach outlined above is a marked improvement in terms
of accuracy and speed over sending video tapes by courier.

*This work was supported in part by the National Science Foundation under contract MIP-9116113 and by NASA.


The target goals imply compression ratios in the range from 250:1 to 1000:1. An obvious first line of attack on this
problem is to determine to what extent spatial and temporal sampling (i.e. frame size and frame rate) can be decimated
without significantly impairing the quality. This has the advantage of being [...]. [...]
from the Medical College of Georgia, a 4:1 reduction in spatial resolution to a size of 256 x 256 [...].
However, the full 30 f/s frame rate was recommended, particularly for pediatric cardiology, where heart rates are often
very high.
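
As a rough check on these targets, the sketch below computes the uncompressed bit rate of the decimated sequence and the channel rates implied by the two compression ratios. It assumes 8 bits per pixel (matching the 0 to 255 range used later in the paper) and that the ratios are measured against the 256 x 256, 30 f/s sequence; neither assumption is stated explicitly in the text, and the function names are illustrative only.

```python
# Assumed source format: 256 x 256 pixels, 8 bits/pixel, 30 frames/s.
WIDTH, HEIGHT = 256, 256
BITS_PER_PIXEL = 8      # assumption: 8-bit grayscale (0..255)
FRAME_RATE = 30         # frames per second, as recommended in the text


def raw_bit_rate(width, height, bits_per_pixel, frame_rate):
    """Uncompressed bit rate in bits per second."""
    return width * height * bits_per_pixel * frame_rate


def channel_rate(raw_bps, compression_ratio):
    """Bit rate remaining after compressing by compression_ratio:1."""
    return raw_bps / compression_ratio


raw = raw_bit_rate(WIDTH, HEIGHT, BITS_PER_PIXEL, FRAME_RATE)
for ratio in (250, 1000):
    print(f"{ratio}:1 -> {channel_rate(raw, ratio) / 1000:.1f} kbit/s")
```

Under these assumptions, 250:1 corresponds to roughly 63 kbit/s, which is close to the 64 kbit/s of a single ISDN channel, and 1000:1 to roughly 16 kbit/s, which is in the range of a voice-band modem.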

Conventional coding methods such as H.261 and MPEG are not well suited to ultrasound video. Their bit rates tend to
be too high and they have difficulty representing the high frequency information in the input. Model-based methods, on the
other hand, allow for high compression ratios but typically suffer from inconsistent performance over a wide
variety of inputs.

In this paper we introduce a model-based method that provides both high compression and robust behavior. To meet the
difficult compression requirements imposed by the telephone bandwidth, it is important to identify and exploit all available
redundancy in the signal and preserve with fidelity those parts of the signal that are important for expert analysis. In the
case of ultrasound video in cardiology, for example, cardiologists must be able to see the shape of the walls, the shape and
[...], and the tissue texture. By taking into account the nature of the noise/texture associated with the
ultrasound images and identifying the important components (wall boundaries, valves, etc.), we formulate a visual model
that can be used for very low bit rate coding.


2 SYSTEM DESCRIPTION 


The components of the proposed coding system are outlined in Figure 1. First, each input frame $x_{n,i,j}$ is decomposed
nonlinearly into two components: a lowpass component, which is denoted by $l_{n,i,j}$; and a highpass or texture component,
$t_{n,i,j}$, where $n$ is the frame number, $i$ is the row number, and $j$ is the column number. The decomposition is based on a
signal model and is optimized empirically such that the lowpass component contains most of the information relevant for
diagnosis, such as the contours of walls and valves. The highpass component contains information about the texture of the
tissue being examined. The nonlinear subband decomposition (upon which we elaborate later) is shown as a block
in Figure 1. After the decomposition, the lowpass component is decimated in $i$ and $j$ to the Nyquist rate. Signal coding
is then performed using an optimized subband coding method recently developed in the Digital Signal Processing Laboratory
at Georgia Tech. Some details of this method are presented in a later section.

The lower branch of the system contains the texture information. It is decimated in the temporal domain and encoded
using a version of entropy-constrained residual scalar quantization. The two encoded components are then time-multiplexed
into the narrowband telephone channel for transmission to the remote location. At the receiver, the channel
signals are demultiplexed and the individual components are decoded. The lowpass component is then upsampled and
interpolated spatially to restore it to its proper size, and the texture component is upsampled (temporally) to the original frame
rate. After the components are restored, they are combined in the 2-D nonlinear synthesis section to form the reconstructed
video. In the next sections, we take a closer look at the individual operations shown in the block diagram in Figure 1.


3 NONLINEAR SUBBAND DECOMPOSITION 

Two particular characteristics of the ultrasound video signal support the idea of using the model-based decomposition.
First, if a static tissue is examined, the ultrasound image can be interpreted as the product of a luminance lowpass component,
representing the intensity of the ultrasonic wave in the vicinity of the examined tissue, and a constant reflectance component,
representing the reflection coefficients associated with the tissue. Second, ultrasound images are typically very noisy.
Usually, additive noise models are used to describe the effect of noise in images. Filtering out the noise could enhance the
images but, more importantly, it makes the image easier to code. If the noise has a Gaussian distribution, then a linear filter is
optimal for maximizing the signal-to-noise ratio. In our case, however, the goal is to maximize the subjective quality of the
lowpass component.

Figure 1: The components of the coding system: (a) sender, (b) receiver

Thus two approaches can be considered: an additive model and a multiplicative model. A model formulation that covers
both additive and multiplicative variants is depicted in Figure 2. It is similar in nature to the homomorphic model pioneered
by Stockham [3] for the purpose of image enhancement.

The filter $H(\omega)$, shown in Figure 2, is a lowpass filter with a cutoff frequency of $\omega_c = \pi/D_1$. The nonlinear
decomposition is then described by the equation

[...]

This decomposition is equivalently a nonlinear subband decomposition. The nonlinearity $\Psi(\cdot)$ is chosen to be of the form

$$\Psi(x) = \beta x^{\alpha}.$$

For $\beta = 1$ and $\alpha = 1$, $\Psi(\cdot)$ is the identity mapping and we obtain the additive model. For $\alpha = 0.231$, $\Psi(x)$ is approximately proportional to $\log(x)$ in
the range 0 to 255, and we obtain the multiplicative model.

The parameter $\beta$ was chosen so that $x$ and $\Psi(x)$ have the same dynamic range, i.e. from 0 to 255. The parameter $\alpha$
was chosen empirically to optimize the subjective performance. Qualitatively, we want the lowpass component to contain
as much useful detail as possible, while keeping constant the cutoff frequency of the filter $H(\omega)$. To quantify this criterion
we could try to minimize the difference between $l_{n,i,j}$ and $x_{n,i,j}$ to address the aforementioned goal. Similarly, we could
try to minimize the energy in the texture. This ensures that the amount of information contained in the texture is not
significant. We have measured these quantities for values of $\alpha$ in the range $0.1 < \alpha < 2$ for a sample set of
images and the results are summarized in Figure 3. Graph (a) shows the dependency of the mean square difference
between $l_{n,i,j}$ and $x_{n,i,j}$, and graph (b) shows the dependency of the energy of $t_{n,i,j}$, on the parameter $\alpha$.
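
To make the role of the two parameters concrete, the following sketch implements the power-law nonlinearity and one possible lowpass path on a single row of pixels. The relation beta = 255^(1 - alpha) is simply one way to satisfy the stated condition that x and Ψ(x) share the range 0 to 255. The lowpass path (apply Ψ, filter, invert Ψ) is an assumption consistent with the later statement that Ψ(l) is bandlimited; the moving-average filter is only a stand-in for H(ω), and the texture path is deliberately left out. All function names are hypothetical.

```python
X_MAX = 255.0  # dynamic range stated in the text: 0..255


def beta_for(alpha, x_max=X_MAX):
    """beta such that psi(x_max) == x_max, so psi keeps the 0..x_max range."""
    return x_max ** (1.0 - alpha)


def psi(x, alpha, beta):
    """Power-law nonlinearity psi(x) = beta * x**alpha (x >= 0)."""
    return beta * x ** alpha


def psi_inv(y, alpha, beta):
    """Inverse mapping of psi, valid for y >= 0."""
    return (y / beta) ** (1.0 / alpha)


def moving_average(signal, width):
    """Crude lowpass stand-in for H; borders are handled by clamping indices."""
    half = width // 2
    n = len(signal)
    out = []
    for k in range(n):
        window = [signal[min(max(k + d, 0), n - 1)] for d in range(-half, half + 1)]
        out.append(sum(window) / len(window))
    return out


def lowpass_component(row, alpha, width=5):
    """Assumed lowpass path: l = psi_inv(lowpass(psi(x))) for one row of pixels."""
    beta = beta_for(alpha)
    mapped = [psi(v, alpha, beta) for v in row]
    return [psi_inv(v, alpha, beta) for v in moving_average(mapped, width)]
```

With `alpha = 1` the mapping reduces to the identity and `lowpass_component` is plain linear filtering (the additive model); with `alpha = 0.231` the filtering takes place in an approximately logarithmic domain (the multiplicative model).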

We can see that the two criteria are conflicting, and a compromise between them is needed. The value $\alpha = 0.231$, which
implies an approximately logarithmic mapping, is in the range of values that provide a good tradeoff between the two criteria.
Therefore, the multiplicative model is a reasonable model to use for the encoding of ultrasound images.

The additive and multiplicative models are compared in Figure 6. A sample original ultrasound image is presented
together with the reconstructed images obtained by using the additive ($\alpha = 1$) and multiplicative ($\alpha = 0.231$) models. We
can see that the multiplicative model has improved subjective appearance.

Because this decomposition is similar to the homomorphic luminance-reflectance decomposition introduced by Stockham [3]
in the context of image enhancement, we can also hope to be able to introduce some image enhancement capability
to the ultrasound images. In fact, the system is constructed with this feature. Unlike Stockham's approach, where different
gains are imposed on the two components, we perform histogram modification of the lowpass component. This provides
greater flexibility for enhancement. The histogram transformation used in this paper is nonlinear and has the profile shown
in Figure 4. It was observed experimentally that the features most difficult to preserve during encoding are represented in
the low and medium amplitude range of the lowpass component. Thus, contrast modification in this region is expected to
enhance perceived quality. Preliminary results indicate that this is true. At this point enhancement results have not been
evaluated by medical specialists, but hopefully will be by the time of the conference presentation.


4 LOWPASS COMPONENT CODING 


Taking into account the way the lowpass component $l_{n,i,j}$ was obtained, it can be represented by $\Psi(l_{n,i,j})$, which is a
bandlimited signal with a cutoff frequency of $\pi/D_1$ in both horizontal and vertical directions. Therefore, $\Psi(l_{n,i,j})$ can be
decimated by $D_1$ in both horizontal and vertical directions without loss of information due to aliasing. Since the mapping
$\Psi(\cdot)$ is one-to-one and onto, $l_{n,i,j}$ can be decimated and reconstructed. By coding the decimated version of $l_{n,i,j}$, the net
bit rate can be reduced dramatically.

Figure 3: Dependency on $\alpha$ of the two optimality criteria. (a) MSE of $(l_{n,i,j} - x_{n,i,j})$ as a function of $\alpha$. (b) MSE of $t_{n,i,j}$
as a function of $\alpha$.

Figure 4: Example of nonlinear histogram transformation for image enhancement.

One of the front-running image coding techniques is subband coding [4, 5]. It refers to a broad class of systems where
the input is decomposed into subband images and the subband images are coded for transmission or storage. In this work,
a new optimized subband image coder is employed. This particular subband coding system consists of decomposing
the lowpass component into 16 uniform subbands using the IIR two-band analysis filters introduced in reference [*].
The implementation can be made very efficient computationally by using specially designed recursive filters that require no
multiplication operations. The subbands are then quantized using entropy-constrained multistage quantizers with intra-band
and inter-band conditioning.

This subband coding system is described in detail in references [1] and [2]. Hence our discussion of this part of the coder
is brief. Let it suffice to say that the subband coder is based on encoding each subband pixel (one quantization stage at a
time) using conditional entropy coding. The conditioning is based on the quantized symbol values in the local neighborhood
of the pixel and in corresponding locations across the subbands. Conditional entropy coding of this form allows statistical
dependencies within and across subbands to be used to our advantage.
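
The benefit of conditioning can be illustrated with a small, generic calculation. The sketch below is not the coder of references [1] and [2]; it only estimates the first-order entropy and the conditional entropy of a toy symbol sequence, using the left neighbor as the single conditioning symbol. The data and function names are made up for illustration.

```python
import math
from collections import Counter


def entropy(symbols):
    """Empirical first-order entropy in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def conditional_entropy(symbols, contexts):
    """Empirical H(symbol | context) in bits per symbol from paired samples."""
    joint = Counter(zip(contexts, symbols))
    context_counts = Counter(contexts)
    total = len(symbols)
    h = 0.0
    for (ctx, _sym), n in joint.items():
        h -= n / total * math.log2(n / context_counts[ctx])
    return h


# Toy row of quantized symbols with runs; the context is the left neighbor.
row = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0]
symbols = row[1:]
contexts = row[:-1]

h_plain = entropy(symbols)
h_cond = conditional_entropy(symbols, contexts)
print(f"H(S) = {h_plain:.3f} bits, H(S | left neighbor) = {h_cond:.3f} bits")
```

For this toy row the unconditional entropy is 1 bit per symbol, while conditioning on the left neighbor lowers it to about 0.81 bits per symbol, which is the kind of gain that neighborhood and cross-subband conditioning is meant to capture.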

In addition, however, we also extend the conditioning to include corresponding pixels in previous frames. Implementation
complexity unfortunately limits the number of conditioning symbols that can be used in practice. Therefore only
the most statistically important conditioning symbols are used (the precise number being fixed a priori by implementation
constraints). For a fixed number of conditioning symbols, an algorithm that finds the location of conditioning symbols such
that the overall entropy is minimized is described in [...].

Conditioning on previous frames is reasonable since there is a lot of correlation between consecutive frames
after the noise has been filtered out in the nonlinear subband decomposition stage. This conditioning scheme is described in
Figure 5, where only spatial conditioning is depicted. Inter-subband and inter-stage conditioning are not shown in the figure
for clarity reasons. Solid lines represent intra-frame conditioning and dashed lines represent inter-frame conditioning. Note
that this conditioning scheme requires a large number of previous frames to be buffered. However, this is not a big problem
in our case, because the frames are small (64 by 64 pixels).

This type of conditioning for cardiology ultrasound video can be used to exploit the fact that the image sequence is
quasi-periodic with a period given by the heartbeat rate. Therefore, we can use conditioning based on symbols from the
frame located one heartbeat period before the current frame. A couple of techniques can be used for estimating the heartbeat
period. Ideally we would choose the value that minimizes the average codeword length in the current frame. This method is
computationally intensive. A simpler method is to use for conditioning the frame that minimizes the difference between
itself and the current frame. However, ultrasound machine outputs often provide the EKG signal explicitly. Thus the
simplest way is to extract the period directly from the accompanying EKG.
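
The frame-matching alternative can be sketched as follows. This is only an illustration under stated assumptions: the similarity measure is taken to be the mean squared difference, frames are flat lists of pixel values, and the search is restricted to a hypothetical lag window (`min_lag`, `max_lag`) so that the immediately preceding frames, which are always similar, are not selected.

```python
def mean_squared_difference(frame_a, frame_b):
    """Mean squared difference between two equally sized frames (flat lists)."""
    return sum((a - b) ** 2 for a, b in zip(frame_a, frame_b)) / len(frame_a)


def estimate_period(frames, min_lag, max_lag):
    """Lag (in frames) of the buffered frame that best matches the newest frame.

    frames[-1] is the current frame; frames[-1 - lag] lies `lag` frames earlier.
    Returns None if no candidate lag fits in the buffer.
    """
    current = frames[-1]
    best_lag, best_err = None, float("inf")
    for lag in range(min_lag, max_lag + 1):
        if lag >= len(frames):
            break  # not enough buffered frames for this lag
        err = mean_squared_difference(current, frames[-1 - lag])
        if err < best_err:
            best_lag, best_err = lag, err
    return best_lag


# Toy sequence that repeats every 5 frames (4 pixels per frame).
toy_frames = [[(k % 5) * 10 + p for p in range(4)] for k in range(16)]
period = estimate_period(toy_frames, min_lag=3, max_lag=8)
```

On the toy sequence the search returns a lag of 5 frames. In practice the lag window would be chosen from the plausible range of heart rates, and the EKG-based estimate mentioned above avoids the search altogether.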


5 TEXTURE CODING 


The texture component is a valuable part of the coded signal in the sense that it contributes to the natural appearance of
the reconstructed image. However, much of this texture component is just random noise. One can postulate that the
texture of the examined region is the same for a relatively long period of time, except when the sensor device is in
motion, and that additive noise contributes most to the rapid time variations in the signal statistics. A simple way to encode
the texture component is to decimate it in the temporal dimension. A texture frame is then only encoded
once every $D_2$ frames. At the receiver, the same decoded texture frame is used for the synthesis of $D_2$ consecutive frames.

For large values of $D_2$, this method may produce an unpleasant effect of static texture. In order to avoid this and
to have a more subjectively realistic decoded video sequence, two consecutively transmitted texture frames may be used
alternately. Alternating between these two texture frames every 1/30 seconds is an improvement, but the flicker effects are
too strong. Further subjective improvement is achieved by switching between them every two or three frames.
Experiments have shown that the best subjective quality is obtained when we switch texture frames every three lowpass
component frames.

Figure 5: The inter-frame and intra-frame conditioning scheme

The same entropy-constrained quantization technique is used to encode the texture. The only difference is that conditioning
is realized only with respect to neighboring pixels and neighboring subbands. This is because there is no significant
correlation between textures $D_2$ frames apart.
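
The alternation between two decoded texture frames described in this section can be expressed as a simple schedule. The sketch below is one possible reading, not the authors' implementation: it assumes texture frame k becomes available at output frame k times the temporal decimation factor, and that the receiver alternates between the two most recently received texture frames, switching every three output frames. The function name and arguments are hypothetical.

```python
def texture_index(frame_number, decimation, switch_every=3):
    """Index of the transmitted texture frame used to synthesize an output frame.

    Assumes texture frame k is available from output frame k * decimation on.
    The receiver alternates between the two most recent texture frames,
    switching every `switch_every` output frames.
    """
    latest = frame_number // decimation
    if latest == 0:
        return 0  # only one texture frame has been received so far
    use_previous = (frame_number // switch_every) % 2 == 1
    return latest - 1 if use_previous else latest


# Schedule for output frames 30..41 with a temporal decimation factor of 30.
schedule = [texture_index(n, decimation=30) for n in range(30, 42)]
```

With these settings the schedule uses texture frame 1 for three output frames, then texture frame 0 for three, and so on, which corresponds to the switching interval reported as subjectively best.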


6 EXPERIMENTS AND RESULTS 

If the lowpass component is encoded at $R_1$ bits per pixel and the texture at $R_2$ bits per pixel, the overall bit rate in bits
per second is given by

$$R = 30 \left( \frac{256 \times 256}{D_1^2} R_1 + \frac{256 \times 256}{D_2} R_2 \right).$$

The lowpass component spatial decimation factor $D_1$ can be equal to 8 if the coding system is used for positioning only,
4 if we need diagnostic quality, and 2 or even 1 if we implement a multiresolution system allowing zooming in the area of
interest. The texture component temporal decimation factor is in the range 25 to 40.

In Figure 7 we present an original ultrasound image (a), the corresponding lowpass component (b), the reconstructed
image (c), and the reconstructed image with contrast enhancement (d). Enhancement has been performed using the histogram
transformation depicted in Figure 4 with the parameters $L = 170$ and $f(L) = 210$.

In this example, $D_1 = 4$, $D_2 = 60$, $R_1 = 0.45$, and $R_2 = 0.25$. The overall bit rate is then
$R = 63488$ bps $\approx 62$ kbps, so this example can be used for transmission over an ISDN line. We have used a uniform
64-subband decomposition, and a quantizer having six stages and two code vectors per stage.
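
The bit-rate expression can be checked numerically. The sketch below evaluates it for the example parameters, read here as a spatial decimation factor of 4, a temporal decimation factor of 60, and rates of 0.45 and 0.25 bits per pixel. The squared spatial factor is an assumption based on the statement that the lowpass component is decimated in both the horizontal and vertical directions; the function name is illustrative.

```python
def overall_bit_rate(r1, r2, d1, d2, width=256, height=256, frame_rate=30):
    """Overall bit rate in bits per second.

    r1: lowpass bits per pixel, r2: texture bits per pixel,
    d1: spatial decimation factor (applied in each direction),
    d2: temporal decimation factor of the texture component.
    """
    lowpass_bits = width * height / d1 ** 2 * r1   # bits per lowpass frame
    texture_bits = width * height / d2 * r2        # texture bits averaged per frame
    return frame_rate * (lowpass_bits + texture_bits)


rate = overall_bit_rate(r1=0.45, r2=0.25, d1=4, d2=60)
print(f"{rate:.0f} bit/s")
```

For these values the lowpass component accounts for 55296 bit/s and the texture for 8192 bit/s, a total of 63488 bit/s, which fits within a 64 kbit/s ISDN channel.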

Encoded video segments will be presented at the conference. 
Figure 6: (a) original ultrasound image, (b) reconstructed image with additive model, (c) reconstructed image with multiplicative model.

Figure 7: Encoded images at $R_1 = 0.45$ bpp and $R_2 = 0.25$ bpp: (a) original, (b) original lowpass component, (c) reconstructed image.